[57] ABSTRACT

An interconnect controller for use in an arbitrary topology 
collection of nodes in a network suitable for use for both 
data sharing and distributed computing. The interconnect 
controller provides four (4) serial ports and two (2) parallel 
ports for communicating with adjacent nodes in a network. 
Linked ports between two nodes provide a continuous 
stream of information with idle packets filling non-data 
transfer cases. The logic of the interconnect controller 
provides for adaptive routing and topology independence 
and allows for the sharing of a common clock for 
synchronizing the packet transmission. 
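
The continuous, idle-filled stream described in the abstract can be pictured with a minimal Python sketch. It is an illustration only and not the controller's logic: the names `IDLE` and `link_stream`, the representation of packets as strings, and the `arrivals` input format are all assumptions introduced here.

```python
from collections import deque

IDLE = "IDLE"  # hypothetical placeholder for an idle packet


def link_stream(arrivals, num_slots):
    """Return what one direction of a link carries in each packet slot.

    arrivals maps a slot index to the list of data packets that become
    ready for transmission at that slot (an assumed input format).
    Exactly one packet leaves per slot; an idle packet fills any slot
    for which no data packet is waiting.
    """
    waiting = deque()
    stream = []
    for slot in range(num_slots):
        # Queue whatever became ready at this slot, in arrival order.
        waiting.extend(arrivals.get(slot, []))
        # Send the oldest waiting packet, or an idle packet if none waits.
        stream.append(waiting.popleft() if waiting else IDLE)
    return stream


if __name__ == "__main__":
    # Two packets become ready at slot 1 and one more at slot 4.
    print(link_stream({1: ["d0", "d1"], 4: ["d2"]}, 6))
```

Because every slot carries either data or an idle packet, the stream in this sketch never pauses; data that becomes ready faster than one packet per slot simply waits its turn.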
APPARATUS AND METHOD FOR
CONTROLLING POINT-TO-POINT
INTERCONNECT COMMUNICATIONS
BETWEEN NODES

This is a Continuation Application of application Ser. No. 08/101,839, filed Aug. 4, 1993.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to communications between computers. More particularly, the present invention relates to point-to-point communications technologies for use in an arbitrarily assembled computer network.

2. Description of Related Art

The evolution of computer technology has seen the progression from huge room-sized collections of tubes to desktop and even [...] increasing the power and speed of single autonomous machines. This has led to today's stand-alone machines which have awesome computational and data processing power.

Relatively recent efforts in the computer field have been directed toward the sharing of data from more than one computer station. Other efforts have been directed toward the use of multiple processors in a single computer to enhance the speed and power of single machines.

Current research has been directed toward combining the above efforts to yield powerful computer systems composed of a plurality of otherwise stand-alone machines. For some time, high speed local area networks have been used to link many computers to facilitate data transfer between multiple autonomous units. Modern offices use such networks to greatly increase the movement of information between users without increasing the use of paper. Similarly, such networks provide alternative communication mechanisms between the network's users in the form of electronic mail and the provision of public forums for common discussion.

A network utilizing an efficient communications protocol may be used for both data sharing and for implementing the concept of a Distributed Shared Memory System (DSM). Unlike a local area network which is motivated by the need to share data, a DSM is motivated by combining multiple processors into one large system with the potential for using the aggregate resources for any given application. A number of different methods have been explored for sharing computer resources in a given network collection.

Whether a network is assembled purely to serve a network function of sharing data or the more complex case of combining computing resources, it is essential that information from any one system in the network be able to be conveyed to any other system in the network. There have been many protocols developed for different implementations, many utilizing a centralized switch. Most have required a priori defined locations and addresses for each member of the network, or nodes. (Note that a given system in a network may in some cases house more than one node.) This predefined nature of the network impedes the ease with which additional elements may be added or existing nodes removed.

Other problems to contend with include the need to prioritize certain types of data transfers. Isochronous data transfers for real-time information such as video and sound may not be unduly delayed and must be delivered in sequential order. Other information such as routine data file information may be conveyed piecemeal with errors corrected out of sequence. These problems are further complicated in networks utilizing a common bus which must be arbitrated for, which arbitration may become overly burdensome with an increased number of nodes in the system.

Finally, reconciling the differing needs of local area network communications and DSM communication into a single protocol has heretofore provided a daunting task.

[...] Communications Architecture for Multi-Processor Networks at Carnegie Mellon [...] describes a theoretical implementation of a distributed shared memory system. That dissertation is incorporated herein by reference.

SUMMARY OF THE INVENTION

In light of the foregoing, it will be appreciated that there is a need for [...] the transmission of data between [...] use in a wide [...] of communication activities. It is therefore an object of the present invention to provide a method and apparatus for conveying information between nodes in a given network of nodes suitable for use in both traditional data sharing network operation as well as for more traffic-intensive shared-memory type applications.

It is also an object of the present invention to provide an interconnect technology based on a distributed switch concept to [...] a flexibly expandable network.

[...] object of the present invention is [...] which is independent of the [...], thus providing for an arbitrary topology network.

It is also an object of the present invention to provide a message routing mechanism utilizing a common buffer pool for the deposit and receipt of information packets for use by the [...] ports of a communications node.

It is another object of the present invention to provide priorities for data packets thus enabling Isochronous data for real-time information.

Another object of the present invention is to provide a temporal alignment buffer to provide an adjustment in the round-trip delay for packets between given nodes thus ensuring an integral multiple of packet transmission times between two nodes.

It is another object of the present invention to provide a distributed phase-locked loop between nodes in a given network to provide for the synchronization of packet transmissions between nodes.

It is also an object of the present invention to provide a method of initializing a network to determine round-trip delays between adjacent nodes as well as breaking cycles to avoid deadlock situations.

Yet another object of the present invention is to provide an adaptive routing mechanism to increase the efficiency of packet transmissions between non-adjacent nodes in a network.

These and other objects of the present invention are provided by an interconnect controller which facilitates communications between given nodes in a network. The interconnect controller comprises four (4) serial ports and two (2) parallel ports. Each serial port has channel module logic circuitry for conveying signals between a corresponding channel module on an adjacent node. Information is
conveyed between nodes over links in a conveyor belt fashion with idle packets being inserted when no other information is being transferred. The delay time in transmission is adjusted by a temporal alignment buffer in the channel modules to ensure that an integral multiple of packet transmission times are used for the total delay time thus providing a mechanism for predicting the arrival and start of each packet transmission. The four channel modules on a single interconnect controller chip share a common buffer pool with linked list entries for identifying which channel module is to propagate each received packet. The common buffer pool is segmented into sixteen (16) bit segments so that received packets may begin retransmission before completing arrival. This also eliminates extra registers which otherwise would be required to convert to an 80-bit parallel interface. In addition to data, data packets include information about packets including destination and error correction information as well as prefix bits identifying acknowledgments and other control information.

The interconnect controller of the present invention includes a routing table that is filled during network initialization. When an incoming packet identifies its destination, that destination can be looked up in the routing table to determine which channel module and output port to use for continuing the packet on its way. When more than one output channel may be used, the interconnect controller logic will determine the channel with least traffic thus providing an adaptive routing mechanism to increase the efficiency of the interconnect system. The routing table used also provides for the assembly of an arbitrary topology graph which may receive new nodes during operation of the system.

The parallel ports of the interconnect controller may be used to link the interconnect controller to a local host as well as to one or two other interconnect controller chips thus providing for a switch with up to twelve (12) serial output channels at a given node.

The channel module logic circuitry of the interconnect controller provides for a distributed phase-locked loop mechanism as well as a clock information sharing mechanism.

Data packets comprise 80-bit words which can be combined into quad packets for the conveyance of 32-byte line cache-size words. The channel modules may also include serializers and deserializers for the processing of data.

Virtual channels are implemented to eliminate deadlock problems. Each point-to-point channel has a Master/Slave bit which is set during initialization such that each end can identify itself as either a Master or Slave in a consistent manner. This Master/Slave bit can be used during the network exploration phase to break symmetries. The Master/Slave bit is also needed during the probabilistic start-up protocol that establishes round-trip delay and word alignment. Each virtual channel is provided with its own set of input buffers to eliminate deadlock situations.

BRIEF DESCRIPTION OF THE DRAWINGS

The objects, features and advantages of the present invention will be apparent from the following detailed description in which:

FIG. 1 illustrates a basic computer architecture which may utilize the interconnect controller of the present invention.

FIG. 2 is a block diagram of the interconnect controller in accordance with the present invention.

FIGS. 3(a)-3(b) illustrate configurations of multiple interconnect controllers to form larger switches in accordance with the present invention.

FIG. 4 illustrates an arbitrary topology collection of nodes forming a network.

FIG. 5 illustrates a block diagram of the interconnect controller of the present invention.

FIG. 6 illustrates graphically the layout for a given communications packet identifying the various fields of bits.

FIG. 7 illustrates a more detailed block diagram of the interconnect controller of the present invention.

FIGS. 8(a) and 8(b) are a [...] representation of the common buffer pool utilized by the interconnect controller of the present invention.

FIG. 9 illustrates two channel modules coupled for the purpose of illustrating a packet exchange from a sending channel module to a receiving channel module as carried out by the interconnect controller of the present invention.

FIG. 10 is a more detailed illustration of the channel module portion of the interconnect controller in accordance with a preferred embodiment of the present invention.

FIG. 11 is a more [...] illustration of the channel maintenance subsystem portion of the interconnect controller of the present invention.

FIG. 12 demonstrates graphically the conveyor belt nature of packet slots between two coupled channel modules in accordance with the communications protocol of the present invention.

FIG. 13 illustrates a portion of the tuning circuitry to be utilized by the interconnect controller of the present invention.

FIG. 14 illustrates a typical frequency/phase comparator that may be used with the present invention.

FIG. 15 [...] buffer of the present [...] a phase/frequency comparator for use in clock synchronizing.

FIG. 16 is a flowchart of the initialization procedure used by the channel modules of the interconnect controller of the present invention.

FIG. 17 illustrates graphically the intended flow of the initialization procedure as carried out by the software attached as Appendix A.

DETAILED DESCRIPTION OF THE INVENTION

An apparatus and a number of methods are described for use in a communications protocol between nodes in an arbitrary topology network. In the following description, numerous specific details are set forth such as data packet lengths and priority types in order to provide a thorough understanding of the present invention. It will be obvious, however, to one skilled in the art that the present invention may be practiced without such specific details. In other instances, well-known control structures and gate level circuits have not been shown in detail in order not to obscure unnecessarily the present invention. Particularly, many functions are described to be carried out by various logic circuits. Those of ordinary skill in the art, having been described the various functions, will be able to implement the necessary logic without undue experimentation.

Overview of the Computer System Incorporating the Present Invention

Referring first to FIG. 1, a typical computer system for use in a data sharing or resource sharing network is illustrated. While the nodes in a network need not be a computer station in all instances, a computer is used for illustrative purposes
in describing the interaction of a local host with the interconnect controller of the present invention. Some nodes in a network need not be coupled to any local host and the interconnect controller of the present invention may operate as a switch or be connected to a dumb host such as a graphical display or a printer.

As shown in FIG. 1, there is computer 100 which comprises three major components. The first of these is the input/output (I/O) circuit 101 which is used to communicate information in appropriately structured form to and from other parts of the computer 100 as well as out of the computer 100. Also shown as part of the computer 100 is the central processing unit (CPU) 102 and memory 103. These two latter elements are those typically found in most general purpose computers and almost all special purpose computers. In fact, the several elements contained within computer 100 are intended to be representative of this broad category of data processor. Particular examples of suitable data processors to fill the role of computer 100 include machines manufactured by Sun Microsystems, Inc., Mountain View, Calif. Other computers having differing capabilities may of course be utilized with the interconnect controller of the present invention.

Also shown in FIG. 1 is an input device 105, shown in a typical embodiment as a keyboard. There is also shown as an input device a graphics tablet 107. It should be understood, however, that the input device may actually be in any other well-known input device (including, of course, another computer). A mass memory device 104 is coupled to I/O circuit 101 and provides additional storage capabilities for the computer 100. The mass memory may include other programs, fonts for different characters and the like and may take the form of magnetic or optical disc drive or any other well-known device. It will be appreciated that the data retained within mass memory 104, may, in appropriate cases, be incorporated in standard fashion into computer 100 as part of memory 103. As shown in FIG. 1, the interconnect controller 20 of the present invention is incorporated with the I/O circuitry 101 of computer 100.

In addition, three typical computer display devices are illustrated, the display monitor 108, the plotter 109 and a laser printer 110. Each can be used to display images or documents or other data utilized by the computer 100. A cursor control device 106 such as a mouse, trackball or stylus are also coupled to I/O circuit 101. Other pointing devices may suitably be used as appropriate. The interconnect controller 20 of the present invention would in most likely circumstances be coupled to the I/O circuit 101 for providing communications between computer 100 and adjacent nodes on the network though certainly alternative configurations may be appropriate.

Interconnect Controller of the Present Invention

The interconnect controller of the present invention is intended to be the interface between a local host and the interconnect system of a given network. In some cases a given single station/location may comprise more than one network node, having more than one interconnect controller. In other cases, an interconnect controller may be independent of an intelligent local host. The interconnect controller may even stand alone, operating solely as an intermediate switch in a network.

FIG. 2 illustrates broadly the interconnect controller 20 of the present invention. In the preferred embodiment the interconnect controller 20 is a 6x6 switch coupled to a local host 21. As described, local host 21 may be a computer work station or some other facility. The local host 21 links to the network via the interconnect controller 20 through the parallel ports 22 and 23. In the preferred embodiment, ports 22 and 23 are 16-bit wide parallel ports. The interconnect controller is also provided with four bi-directional serial ports 24, 25, 26 and 27. Each of these serial ports is used to couple the interconnect controller from a given node to the interconnect controller of an adjacent node. Thus with a [...] each of which may be similarly linked [...].

In some situations, it may be desirable for a node to be [...]. FIG. 3(a) illustrates that two interconnect controllers [...] ports to provide a node with an 8-way switch. FIG. 3(b) illustrates that three (3) interconnect controllers of the present invention may be combined to form a 12-way switch. Such a switch may be useful for forming a [...] topology graph.

In the typical situation a node will have a single interconnect controller with four external serial ports. Each port may be used to couple the node to an adjacent node through one of its external serial ports. Ports may be coupled via [...]. The driving circuitry for different communications media are generally well-known in the art and will [...] encoding, decoding and clock recovery [...] that will not be described herein. Such mechanisms are left to designers to implement as appropriate.

Before [...] the interconnect controller of the present invention, it is [...] to describe the overall concept of a decentralized switch network and the format of the data packets exchanged between nodes in accordance with [...].

FIG. 4 illustrates [...] of seven nodes [...]. The dark lines indicate point-to-point interconnections between nodes, connected to one of the serial ports of the nodes at each terminal end. As can be seen, nodes A and G each have only one adjacent node, nodes B, E and F each have two and nodes C and D each have [...]. During system initialization the logic associated with [...] transmission delay time between nodes is calculated for use by the communications protocol. The initialization process and communications protocol will be described in more detail further herein.

During normal operation adjacent nodes continuously exchange data packets over links between corresponding coupled ports. Each data packet is of a fixed length and takes a finite amount of time to propagate toward an adjacent node. The interconnect controller incorporated in each node intelligently adjusts the transmission delay between the adjacent nodes to be equal to an integral number of packet transmission times. The mechanism for adjusting the transmission delay between adjacent nodes is called a temporal alignment buffer and will be described in more detail further herein. Because the transmission delay between adjacent nodes over a given link is equal to an integral multiple of the time for launching a single data packet, each node knows
when to expect the header of each of the continuously fed packets supplied to it.

FIG. 5 illustrates in some more detail the interconnect controller 20 of the present invention. Within the controller is the controller's logic 28 which directs the controller to carry out many of the functions to be described in following sections. The controller 20 also maintains routing table circuitry 29 which is coupled to the controller logic. The routing tables are filled during system initialization and are continuously updated during system operation by background running monitoring routines. There are a number of methods known for computing the routing tables such as those implementing the shortest path solutions suggested by a number of conventional textbooks. The operation of the present invention will be described assuming accurately filled routing tables and continuously updated routing tables based on any of the selected number of known methods. When a data packet is received at a node through one of the node's serial ports it includes information indicating the ultimate destination node for the packet. The routing table for each node maintains information about which port a packet should be transmitted from in order to reach its eventual destination in the most efficient manner.

FIG. 5 also illustrates four channel modules 30, 31, 32 and 33. Each channel module is associated with one of the serial ports and controls the continuous exchange of packets between the node and one adjacent node through the adjacent node's corresponding channel module. The channel modules are responsive to control information that arrives with each packet and maintain tables of information about pending transactions. The channel modules also house the temporal alignment buffer used for adjusting transmission delay times to equal an integral multiple of packet transmission times. Each channel module also includes logic for counting the number of pending transactions over that module. This allows the interconnect controller logic to implement an adaptive routing mechanism. When a packet may be transmitted over alternative channel modules en route to its ultimate destination, the interconnect controller logic is capable of determining which channel module has the least amount of pending traffic thus reducing the total latency for a packet in transit.

Referring now to FIG. 6, the bit assignment for a data packet is illustrated. The use of the data packet by the interconnect controller will be described by simultaneous reference to elements shown in the block diagram of FIG. 5. The data to be transmitted from a node may either have originated with the node's local host or have been received from another node en route to its ultimate destination. In any event the interconnect controller has buffered the data to be transferred and conveys packets to the appropriate channel module 80 bits at a time. As shown in FIG. 6, the 80 bits comprise first a 12-bit destination address. During system initialization, each node is assigned a unique address and each routing table is supplied with each address identifying the next node in the route for packets toward each address. Following the destination field there is a 2-bit field defining the priority of the packet. In the preferred embodiment four levels of priority have been suitable for all transactions thus making two bits sufficient. Likewise, the 2-bit field following the packet priority provides for identifying four different types of packets that may be sent. Each of these will be described in more detail in subsequent paragraphs. The 12-bit field following the packet type field provides the address of the node which was the source of a given packet. This is followed by a 4-bit age field. The age field is incremented upon certain conditions, such as when a delivery fails. Finally, following the age field the remaining 48 bits of information may be used for data.

The interconnect controller of the present invention recognizes four levels of priority such that high priority traffic is not blocked by congestion at a lower priority level. At least two levels of priorities are required to ensure that a memory coherency mechanism is deadlock free when the system is being used for distributed shared memory systems purposes. The remaining priority levels may be used to support time critical communications such as video or audio data traffic.

The appropriate channel module receives the 80 bits of the packet data and appends a 10-bit postfix. The first 9 bits of this postfix code are an error check code to be used by the receiving channel module to verify the integrity of the data transfer. The remaining one bit of postfix data is an abort bit that can cause the entire packet to be discarded. Following the 10-bit postfix data there is a 6-bit field that is a prefix for the following packet. The prefix contains the virtual channel id, a piggy-back acknowledgment bit, a quad packet component id and a one bit sequence code. Thus, a packet in transit between two nodes comprises 96 bits of information.

Upon receipt of the 96 bits, a receiving channel module strips off the 16 appended bits. If the check code indicates that the packet has suffered no errors and the flow control information allows, the remaining 80 bits are supplied to the interconnect controller of the receiving node either to be routed to the next node on the way to its destination or are supplied to the node's local host if the receiving node is the final destination for the packet.

As was described above, the preferred embodiment implementation of the point-to-point communications protocol to be implemented has as an object the transferring of data in 32-byte blocks (cache line size). Clearly an 80-bit packet cannot accommodate 32 bytes (288 bits) of information. Therefore the concept of a quad packet is introduced. When data is being transferred (as opposed to idle packets or control packets) four successive packets are used. Hence the term quad packet. Only the first packet, the header packet of a quad, includes the destination, priority, type, source-id and age fields. The following three packets comprise a full 80 bits of data. When quad packets are sent, the prefix bits are used to identify that a following packet is part of a quad body. Thus the receiving channel module knows that no destination address will be included and that the quad body packets follow the quad header. Quad body packets are also provided with the 16-bits of appended information including the check code and flow control information. If any one of the quad packets has an error an error bit is indicated and the entire quad packet will need to be resent.

When a packet is transmitted from one node to another between coupled channel modules, it is necessary for the receiving node to acknowledge receipt to the transmitting node. There are many reasons for this, particularly, to indicate when retransmission is necessary due to errors. Another reason is because the sending node has only a finite amount of storage for holding pending packet transactions which should be flushed from its buffers upon completed transmission. Point-to-point packet propagation time is typically several times the length of packet transmission time. Therefore, there will be multiple packets in transit concurrently, each of which will be treated independently. Channels do not preserve transmission order, so the retransmission of a packet has no bearing on the state of the preceding or following packets.

In the preferred embodiment, each link consists of two physical connections that carry independent traffic in opposite
directions. Since transmission is slotted into equally holding them until the write port on the buffer pool is 

spaced packets in both directions, and because transmission available. Additional read and write ports are provided for 

rates are equal the return traffic carries the replies of the conveying parallel data to a local host 21 over parallel 

forward traffic (piggy-back acknowledgments). The round- transmission lines 22 and 23. 

trip delay for a packet that is returned immediately by the In the block diagram illustrated in FIG. 7, each channel 

remote node determines the number of slots in a channel. module is shown with an associated channel maintenance 

This measured delay is a critical constant for each channel, subsystem. Channel maintenance subsystem 34 is associated 

because it is used to avoid multi-bit sequence numbering or with channel module 30, channel maintenance subsystem 35 

other means to match replies with outbound traffic. For a is associated with channel module 31, channel maintenance 

given channel delay of <n>, the reply is expected in the subsystem 36 is associated with channel module 32, and 

<n>th received packet. This mechanism for each channel is channel maintenance subsystem 37 is associated with chan- 

a ring of <n> independent slots. The same handshake logic nel module 33. The channel maintenance subsystems 

is applied to each slot without taking the state of adjacent include the low level function associated with bit serial data 

slots into account, which greatly simplifies the protocol transmission that will be described further herein with 

processing. The maximum number of slots is dictated by the respect to FIG. 11. 

amount of independent storage for each slot's state which is When a channel module attempts to write received data 

a limitation on the maximum channel distance. Longer into the packet buffer pool, the 12-bit address is simulta- 

distances are possible, if channel bandwidth is reduced. In neously supplied to the routing table circuitry 29. The 

such a case, idle packets are inserted into all slots for which routing table output is an 8-bit word that specifies 

no state storage is available. which channel may be used for the packet. This word 

is interpreted by the buffer control logic 41. The buffer 

When a packet is sent from node A to node B, node B control logic 41 maintains a linked list index to the registers 

upon receiving the packet sends its acknowledgment to node of the packet buffer pool. Various registers may be free or 

A. The acknowledgment is conducted by setting a bit in the occupied at different times irrespective of their actual loca- 

postfix portion of the returning packet being sent from point tion in the register file. The linked list index provides 

B to point A at the time the message from point A to point head-to-tail linked list pointers for all stored data packets 

B is received (the packet occupying the same slot). The two and is used to index the packet buffer pool 40. By using the 

messages need not have anything in common other than the multi-ported register file that is accessible by all channels, 

fact that the message from node B to node A occupies the each channel module may deposit received packets without 

same slot on the packet "conveyor belt" between the two requiring dedicated storage for each channel. 

ncH * es ' 30 As will be described, the channel modules each may 

The value of the bit to be set in the piggy-back acknowl- include a serializer and deserializer for converting a 1-bit 

edgment is a function of a value sent in the transmitted serial data stream into 16-bit packets and vice versa. The 

message from node A to node B. In the postfix control data encoding method used (such as 8b/10b coding) must support 

bits, there is a bit called the msgSeqNo. Node A when three special code words, named CI. C2, C3, that cannot 

sending packets to node B alternates the value of this 35 appear in the data stream. These out-of-bound code words 

message between 0 and 1 for each successive transmission. are detected by the deserializer and are used to synchronize 

Node B, in acknowledging the accurate receipt of a packet the channel maintenance subsystem during initialization, 

from node A sets a bit called rspSeqNo in the return packet Three of these codes are used by the channel maintenance 

being sent to node A equal to the opposite value from the subsystem to establish the round-trip delay and proper 

msgSeqNo value of the received packet Node A upon 40 packet synchronization to be described with respect to the 

receiving a packet in the appropriate slot will compare the initialization procedure flowchart of FIG. 16. Data will be 

rspSeqNo of the received packet to the msgSeqNo of the exchanged between the common buffer pool and the channel 

previously sent packet If the values are toggled, then the modules 16 bits at a time. FIG. 8(«) is a more graphical 

channel module at node A need no longer store the original illustration of the packet buffer register file arrangement In 

packet which was sent from node A to node B because it has 45 the preferred embodiment, there is provision for 64 80-bit 

been conveyed with no errors. Thus, piggy-back acknowl- packets with each packet being stored in blocks of 16-bit 

edgments are used without requiring independent acknowl- double words. Thus, in some circumstances when one chan- 

edgment messages being routed between nodes. nel module receives a packet, the packet can be written from 

FIG. 7 illustrates the interconnect controller of the present the receiving channel module to the buffer pool 16 bits at a 

invention with the interconnect logic shown in more detail. 50 time while it may begin being retransmitted through a 

Each interconnect controller is synchronized via an internal second channel module on its way to the ultimate destina- 

timing circuit 42. Hie details of an interconnect controller's tion. Likewise, if the receiving node is the ultimate desti- 

timing logic are described further herein with respect to FIG. nation for the packet, the packet buffer logic may immedi- 

14. The interconnect controller utilizes a common buffer ately convey the packet to the local host 16 bits at a time 

pool for all communications channels. The common buffer 55 while the packet is still being received 

pool comprises the packet buffer pool register file 40 and the While the packet register buffer is a multi-port register file 

packet buffer control logic 41. Each of the channel modules that can hold 64 packets of 80 bits each, logically it appears 

is coupled for reading and writing to the common buffer as a set of queues for each virtual channel and priority level, 

poot In the preferred embodiment of the present invention, The file is, as described, physically subdivided into five 

the packet buffer pool is multi-ported with channel modules 60 banks so that packets can be inserted and removed in 

30 and 31 being coupled to one read port and channel quantities of 16 bits. FIG. 8(J>) illustrates the relationship of 

modules 32 and 33 being coupled to a second read port This the five subdivided banks with each of the serial and parallel 

facilitates the transmission of data from two channels of the channels. The location in the register file to store the 

node simultaneously. There is only a single write port to the incoming packet is computed in advance given a bit vector 

packet buffer pool which is coupled to the 4-channel mod- 65 representing the free buffers. Additional buffer reservation 

ules. Channel modules can all receive data at the same time logic maintains separate buffer pools depending on the 

but each are equipped with buffers for receiving packets and virtual channel and packet priority. 
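The free-buffer bit vector mentioned above can be pictured with a short software sketch. This is only an illustration under stated assumptions, not the patented hardware: the class name `FreeBufferVector`, its method names, and the choice of always picking the lowest-numbered free buffer are hypothetical. Only the pool size of 64 buffers comes from the text.

```python
NUM_BUFFERS = 64  # pool size stated in the text: 64 packet buffers


class FreeBufferVector:
    """Hypothetical model: bit i of `free` is 1 while buffer i is free."""

    def __init__(self, size: int = NUM_BUFFERS) -> None:
        self.size = size
        self.free = (1 << size) - 1  # all buffers start out free

    def next_free(self):
        """Return the lowest free buffer index, or None if none is free."""
        if self.free == 0:
            return None
        lowest_set_bit = self.free & -self.free  # isolates the lowest 1 bit
        return lowest_set_bit.bit_length() - 1

    def allocate(self):
        """Claim the lowest free buffer and return its index (or None)."""
        index = self.next_free()
        if index is not None:
            self.free &= ~(1 << index)  # clear the bit: buffer now occupied
        return index

    def release(self, index: int) -> None:
        """Mark a buffer free again, e.g. after a successful acknowledgment."""
        self.free |= 1 << index
```

Because `next_free` depends only on the current vector, the storage location for the next incoming packet can be worked out before that packet arrives, which is the property the text relies on. The lowest-index policy is an assumption; the source does not say which free buffer is chosen.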



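The piggy-back acknowledgment described earlier (msgSeqNo in the outgoing packet, rspSeqNo in the packet returning in the same slot) can be sketched as follows. This is a minimal illustration rather than the controller's logic: `SlotState`, `sender_send`, `receiver_response_bit` and `sender_check_ack` are hypothetical names, and the receiver echoing the bit unchanged when a packet was not accurately received is an assumption, since the text only specifies the successful case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlotState:
    """Hypothetical per-slot sender state kept by a channel module."""
    msg_seq_no: int = 0               # msgSeqNo used for the packet in flight
    pending: Optional[bytes] = None   # packet held until it is acknowledged


def sender_send(slot: SlotState, payload: bytes) -> int:
    """Store the packet for the slot and return the msgSeqNo sent with it."""
    slot.pending = payload
    return slot.msg_seq_no


def receiver_response_bit(msg_seq_no: int, received_ok: bool) -> int:
    """rspSeqNo for the returning packet in the same slot.

    On accurate receipt the bit is the inverse of msgSeqNo (from the text).
    Echoing it unchanged on failure is an assumption of this sketch.
    """
    return msg_seq_no ^ 1 if received_ok else msg_seq_no


def sender_check_ack(slot: SlotState, rsp_seq_no: int) -> bool:
    """Return True and free the stored packet if the bit was toggled."""
    if slot.pending is None:
        return False
    if rsp_seq_no != slot.msg_seq_no:
        slot.pending = None      # delivered without error: release the buffer
        slot.msg_seq_no ^= 1     # alternate the bit for the next transmission
        return True
    return False                 # not acknowledged: packet must be retried
```

A sender would call `sender_send` when a slot departs and `sender_check_ack` when the same slot comes back around the link; a `False` result corresponds to the retransmission path described in the text.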

At the write port of the first memory bank, a newly-received packet is input to the router control logic. The first 16-bit piece of data contains the destination address, priority and type. Concurrently with the write operation, the destination address is used to perform a routing table lookup in routing table circuitry 29. As described, the routing table entry consists of a [illegible in source] that accommodates all combinations of four channels with two virtual channels each. A set bit in the virtual channel mask designates the corresponding channel module as a path. The least significant 4 bits specify virtual channel 0 while the most significant 4 bits specify virtual channel 1, and an all 0 entry designates parallel port A, while an all 1's entry designates parallel port B. The virtual channel mask, priority and the virtual channel contained in the prefix are used to determine into which queue the packet will be appended. The queue structure uses the linked lists hardware described previously. Enqueue and dequeue operations may overlap such that sending of a packet may commence before it has entirely been received. Because of this overlap, a check code error on an incoming packet causes an abort bit to be set in the corresponding outgoing packet so that the receiving logic of the final node is able to discard the packet when it all arrives at one place.

All the operations concerning receiving, routing and packet enqueuing/dequeuing take place concurrently at a rate that matches the total packet throughput rate. This performance can be achieved because all packets are of equal size and arrive at precisely scheduled time slots.

FIG. 9 is provided to demonstrate the operation of a channel module and channel maintenance subsystem from both the transmitting end and receiving end of a packet transfer transaction. From the transmitting channel module, a data packet is received in 16-bit increments from the packet buffer pool. To each data packet the check code is calculated and appended by the CRC generator 52. The multiplexing logic and check code generation circuit are driven by the channel module's control logic 55 which also communicates with the packet buffer register file and its associated logic. Within the control logic for each channel module there exists a collection of data registers or buffers with information about each slot in the packet "conveyor belt" which circulates between connected channel modules. From the channel module, the data packet proceeds to the channel maintenance subsystem which by means of encoding logic 58 serializes the data for transmission as a serial bit stream across the chosen communications media such as fiber optic cable or twisted pair cable.

A channel module on the receiving end of a packet transmission receives the 96 bits of the packet data with appended postfix data as a serial bit stream. The channel maintenance subsystem at decoding logic 68 of the receiving channel modules decodes the incoming packet. The encoding logic 58 is implemented at design time to follow a predetermined encoding algorithm. The decoding logic of the receiving channel module must be selected for decoding the chosen encoding scheme. The packet is then aligned to 16-bit double words at packet alignment circuit 69. The packet alignment circuit 69 includes a queue to absorb clock jitters and the variable delay element, the temporal alignment buffer 69, that is used to adjust the round-trip delay to an integral multiple of the routing cycle. The temporal alignment buffer 69 is used to adjust the signal transit time to be an integral multiple of the packet transmission time. This is achieved by adding the delay element in the receive data path, which is essentially a 16-bit wide shift register of depth 0-5. The depth is set during the initialization procedure when the round-trip delay is calculated and the packet transmission time is known. A depth of 0-5 is sufficient because a six clock cycle pipeline is utilized by the preferred embodiment interconnect controller. At the receiving channel module, the 16-bit postfix is stripped off with the check code being confirmed at the CRC check code circuit 62. The six bits are then processed by the logic 55 [two lines illegible in source] for storage in the common buffer pool of the interconnect controller.

One of the bits in the control portion of the packet is a piggy-back acknowledgment of a previously sent packet in the same slot by the receiving channel module. If the acknowledgment bit checks out properly such that the rspSeqNo is the inverse of the msgSeqNo then the packet state register buffer for that packet is cleared and a control signal is sent to the common packet buffer pool thus freeing space in the buffer pool once a packet has been successfully conveyed from a given interconnect controller.

FIG. 10 is provided to illustrate in more detail the channel module logic of the present invention. The illustration of FIG. 10 incorporates both the transmission and receiving portions of the channel module as well as the control logic state register file. The control portion of the channel maintenance subsystem is illustrated separate from the decoding and the encoding circuitry but provides the controls for processing of signals as described above. Also illustrated in FIG. 10 are the FIFO buffer 77 and temporal alignment buffer 69 [partly illegible in source] for synchronizing signals to integral packet transmission times. As FIG. 10 illustrates, in the preferred embodiment these elements are incorporated on the packet receiving side of the mechanism.

FIG. 11 shows in more detail a portion of a channel maintenance subsystem. Of particular interest here is the FIFO RAM 77 which is a buffer for receiving packet signals. One function of FIFO RAM 77 which will be discussed further herein is its use for synchronizing clocks between linked channel modules by monitoring the depth of the data stored in the buffer.

FIG. 12 is provided to illustrate a graphical demonstration of the conveyor belt nature of the slotted packet exchange between linked channel modules. As was described, the transmission time between adjacent channel modules is adjusted by means of the temporal alignment buffer to provide exactly an integral number of packet transmission times for the total round-trip delay. In the illustration of FIG. 12, a ten slot communication link is illustrated. There is provided at both channel modules a packet state buffer register file which provides for 10 different entries in the case of FIG. 12, each corresponding to one slot in the link. If a link is of a length such that there are more slots than available storage space in the state buffer register file, idle packets are inserted to insure that a state buffer is available for each useful slot.

Logically, the interconnect controller has six packet sources and six packet drains that are connected through a set of 40 queues. For each outbound channel, there is one queue for each combination of four priority levels and two virtual channels. The transmitters serve their queues in strict priority order: as long as there is pending traffic in a high priority queue, no lower priority queue will be served. The four serial channels have actually two sets of independent queues, one for each of two virtual channels that do not interfere with each other. Channels alternately serve their virtual channels fairly. However, if a virtual channel has no
pending traffic, the entire channel capacity is available to the other virtual channel. Virtual channels are used to avoid deadlocks and are identified when the routing tables are [illegible in source].

A packet transmission is not assumed to be error free, rather the channel module has to verify packet integrity based on the check codes appended in the postfix. Packets which are rejected by a channel module due to lack of buffers or a corrupted packet are placed in a reject queue by the sender. Rejected packets go through a routing cycle to determine a new transmit channel and are then inserted at the head of the corresponding outbound queue. Higher priority is given to rejected packets in order to reduce the average latency variation. Receive errors also corrupt the piggy-back acknowledgment which causes the original packet sent in a corresponding time slot to be retransmitted. In the event of a duplicate transmission the receiver will discard the packet since the sequence ID bit of the incoming packet will not match the expected sequence ID bit.

The channel module protocol has the property that it neither drops or duplicates packets. In the presence of transmission errors, the determination that a transfer is completed depends on a number of errors that occur. Once it is sent, a packet and its associated buffer are locked by the channel module until it can be determined that the transfer succeeded or failed. If a packet transmission has failed, the packet is returned to the router and a new routing decision is made. Each of these failures causes the age field of the [line illegible in source] retry. When the age field has been saturated, a packet is [returned] to the sending node which attempts again to route the traffic. If the age field is twice saturated, the local host of the final saturating node is notified for error processing or a user is alerted.

Because the communications protocol of the present invention requires each channel module to know exactly when packets are going to be received, it is necessary that all the interconnect controllers share a global clock so that all [line illegible in source] timing of packet exchanges. A single clock source eliminates the need for synchronizers in many places, most notably in the communication channels. This avoids delays and reduces the potential for intermittent failures.

For the reasons discussed above, the clocks of all interconnect controllers need to be synchronized. However, no tight bounds on the clock skew will be required, so that the clocks of two different clusters may vary their phase as long as their average phase relation is constant and transient phase differences do not exceed [fraction illegible in source] of the packet transmission time.

There are no clock wires connecting interconnect controllers, instead clocks are recovered from the data transmissions. These transmissions are synchronous and continuous so that a reconstructed clock is always available. Periods with no data to be transmitted are filled with idle packets that exchange low priority status information. At each interconnect controller, the recovered clocks of the incoming channels are compared to the local clock as illustrated by FIG. 13.

Each recovered clock (IN-0, . . . , IN-n) is fed into a frequency/phase comparator. Frequency/phase comparators differ from plain phase comparators in their ability to compare the frequency of uncorrelated signals. Plain phase comparators are simpler and more precise in the locked state. However, they produce random output signals if the input signals are uncorrelated. A normal PLL circuit will eventually acquire lock. The "PLL" circuit in FIG. 13 tries to achieve phase lock among a distributed set of oscillators and probably would not work with plain phase detectors.

A typical frequency/phase comparator is outlined in FIG. 14. This circuit is easily integrated with current CMOS technology. The output acts as a charge pump and is meant to be connected to a high capacitance node. The two output transistors 90 and 91 are very small and act as current sources that are briefly turned on to add or remove charge from the output node, which becomes part of the subsequent low pass filter 95 (FIG. 13). This has the advantage that the outputs of several comparators can be tied together to form a cumulative integrator. In the locked state, the net output current becomes 0. Multiple units can be integrated on one chip such that both output transistors are disabled if the corresponding channel is unused or loses synchronization.

It is [important] to note that the VCO 96 of FIG. 13 is crystal based. The operating frequency of a crystal oscillator is very well defined. Tolerances of 10^-6 are not uncommon. The assumption that the open loop center frequencies over the operating temperature range of all interconnect controllers are within ±10^-4 of the design specification is quite conservative. Crystal oscillators can be electrically tuned within a narrow range of about ±10^-3·f0. Such a voltage controlled (crystal-) oscillator is a good local clock source even if the control loop is disabled or in an arbitrary state.

The above discussion with respect to the distribution of clock signals can be found in Nowatzyk, A., Communications Architecture for Multi-Processor Networks, Ph.D. Dissertation, Carnegie Mellon University, December 1989, which is incorporated herein by reference. In summary, an interconnect controller recovers all clocks received from channel modules to which it is connected and compares them to its own clock, averages over the comparison and then uses the average to adjust its own clock. With all interconnect controllers doing this, after a brief transient start-up time, the overall network will operate on a single distributed global clock.

One refinement incorporated in the present invention is to use the FIFO buffer 77 and phase logic 78 of the channel maintenance subsystem of FIG. 11 as the phase/frequency comparator. The FIFO depth will change with respect to the synchronization with the clock of a connected channel module. That is, the FIFO buffer 77 of a given channel module will begin to fill up if that channel module is running slower than the channel module from which it is receiving data. Likewise, the FIFO buffer will begin to empty for a channel module that is running faster than the channel module it is coupled to. The deviation from the 1/2 full state of the FIFO is proportional to the phase difference of the clocks of the sending and receiving nodes. If there exists a difference in frequency between these nodes, the rate of FIFO overflows/underflows is proportional to the frequency difference. Provided that the FIFO controller does not wrap around (an overflow cannot result into the empty FIFO state), the FIFO depth is a measure of both phase and frequency differences.

FIG. 15 illustrates graphically by means of a truth table how the FIFO buffer depth is translated into an analog signal that reflects the phase/frequency difference. This signal is externally low-pass filtered by LPF 82 and used to control the voltage controlled oscillator 83 that supplies the clock for the system. By adjusting the values of the resistors, a piece-wise approximation of a cubic transfer function is realized which causes faster convergence to a common operating frequency which is biased toward the average of
the free running center frequencies of all nodes. Ideally, the FIFO buffer should remain half-full in synchronized operation. The monitoring of the FIFO buffer depth is carried out by the phase logic 78 coupled to the FIFO 77 (FIG. 11). The phase logic drives the FIFO depth result to a digital-to-analog converter (not shown) which supplies the input to LPF 82.

In the preferred embodiment, the FIFO buffer 77 and temporal alignment buffer 69 are implemented with a common memory with separate read and write ports.

The timing circuit 42 for an interconnect controller as illustrated in FIG. 7 includes a counter which maintains a current value corresponding to the real-time maintained by the interconnect controller. One aspect of the present invention is that diagnostic packets may be propagated between coupled channel modules with a header that indicates that they are such diagnostic packets. A time stamp may be included in such a packet which indicates the time at the transmitting interconnect controller when the packet is sent. Since the receiving packet knows the delay in time between transmission and receipt of exchanged packets, it is capable of determining in its logic what time the resident timer counter should have. This counter may be programmed by the interconnect controller logic and adjusted atomically such that all coupled interconnect controllers can establish a consistent time.

Another aspect of programmable interconnect controllers is that they may be operated remotely by transmitted control signals. The control logic [remainder of line illegible in source] to be used in distributed topology exploration algorithms. In this case, multiple nodes may start independently to map the interconnect system by incrementally adding nodes to the explored domain. The locks in each node prevent a node to be mapped by two different mapping agents. Instead, they will notice that more than one mapping agent is active and an arbitration process will be used to merge the two domains with only one mapping agent continuing the process.

Additionally, the interconnect controller logic should be equipped with a watchdog timer such that a node, if put into a lock state, will emerge and wait in an accessible state for commands after some predetermined amount of time. Thus, where a node having an interconnect controller is in an unreachable location, it cannot be permanently disconnected from the network with no hope of recovering access to it. This watchdog timer will activate whenever the node is in some sort of critical state and will wake it up to an accessible state after a predetermined amount of time. This facility also provides for remote configuration of interconnect controllers.

Finally, it is instructive to discuss how upon initialization each interconnect controller determines such things as the round-trip delay between coupled channel modules. FIG. 16 is a flowchart illustrating the initialization procedure that is carried out by the logic of the interconnect controller of the present invention. Appendix A illustrates a Verilog program listing which can be used to generate the logic of the present invention.

FIG. 16 illustrates the initialization procedure 200. Initially upon power-up reset at step 201 each channel module sends a constant "idle" bit pattern which is an out of band signal (C1). Channel modules that receive this pattern assert a signal detect bit which indicates that the channel module is coupled to another channel module and is not a non-connected port. The initialization procedure waits a programmable delay to allow stabilization of the distributed phase lock loop and clock distribution. If after synchronization at decision box 202 a channel module receives a "probe" word (C2) from a channel module to which it is coupled then at step 203 the channel module receiving the "probe" word sends a "probe-echo" (C3) word back to the probing channel module.

Once the system has stabilized and if no probe word has been received by the channel module then at step 204, the channel module will initiate the sending of a "probe" (C2) word to the channel module to which it is coupled. The "probe" (C2) word is an out of band signal which causes the remote channel module to send a "probe-echo" (C3) word in response. Sending of both the "probe" and "probe-echo" words are synchronized to the routing cycle of the sending channel modules. At step 205, if no signal is received from the remote channel module then the routine loops until either the receipt of the "probe-echo" word or until the receipt of a "probe" word from a remote channel module that may have sent its "probe" word at the same time. At step 206, the receiving of either signal, "probe" or "probe-echo", provides the ability to set the temporal alignment delay so that the probe words arrive synchronized to the respective routing cycles. If at step 207 it turns out that a channel module which sent the "probe" word receives back a "probe" word and not a "probe-echo" word, a form of deadlock in determining which of the two modules is the master for initialization purposes occurs. This case is resolved with a probabilistic [protocol] at step 208 in which each side randomly decides whether or not [remainder of line illegible in source]. Once an asymmetric decision is made (via electronically flipping a "coin") the initialization [illegible in source] of the decision is recorded in Master/Slave bits described previously.

The same procedure described above may be used for resynchronization after a transient synchronization loss. Either node may stop normal transmission and start sending the C1 signal.

FIG. 17 illustrates graphically the signal exchanges as described by the Appendix A program listing for initialization. In this illustration, C1 represents the abort or initial synchronization pattern. C2 represents the "probe" word and C3 represents the "probe-echo" word as described with respect to the procedure of FIG. 16.

An interconnect controller and communications protocol have been described for use by a node in an arbitrary topology collection of nodes in a network suitable for use for both data sharing and distributed computing. Although the present invention has been described in terms of preferred embodiments, it will be appreciated that various modifications and alterations might be made by those skilled in the art without departing from the spirit and scope of the invention. The invention should, therefore, be measured in terms of the claims which follow.

We claim:

1. For use by nodes in an arbitrary topology collection of nodes wherein each node may have a plurality of communications channels and a plurality of adjacent nodes to each of which the node is coupled through a single communications channel, respectively, each of said nodes having an interconnect controller having means for controlling the exchange of data packets having a length of (W) bits over communications channel, wherein to transmit a packet having (W) bits plus (X) appended control bits requires a time (T), the method of exchanging data packets between adjacent nodes comprising the steps of:

adjusting the round trip delay (Dij) for data packets transmitted between adjacent nodes i and j to equal an
integral multiple of the packet transmission time (T)
thus providing for Dij/T packet transmission slots
between adjacent nodes i and j; 

maintaining a status table for packet transmission slots 
between adjacent nodes; 

inserting null data packets when there are more transmis- 
sion slots between adjacent node than there are entries 
in said status table; 

each node receiving data packets from all adjacent nodes
through all coupled communications channels and stor- 
ing said data packets in a common buffer pool shared 
by a given node's communications channels; 

extracting said check code bits from a received data 
packet; 

determining if said data packet was accurately received; 
acknowledging the accurate receipt of said data packet to 

the sending node if said data packet is accurately 

received, 

toggling a sequence bit in a returning packet occupying 
the same transmission slot as the received packet; 

requesting the retransmission of said data packet from the 
sending node if said data packet was not accurately 
received; 

queuing packets stored in said common buffer pool of a 
node for transmission to an adjacent node through an 
appropriate communications channel where said selec- 
tion of said appropriate communications channel is 
determined by indexing a destination id included in 
said data packet into a routing table; 

assigning data packets in said routing table to channel 
modules having the fewest pending transactions; 

extracting a data packet from said common buffer pool for 
transmission through said selected communications 
channel; 

determining check code bits for said data packet based on 

the content of said data packet; 
appending said data packet with said check code bits; and 
continuously conveying data packets between adjacent 

nodes through isochronous coupled communications 

channels by conveying a data packet upon receiving a 

data packet.

2. An interconnect controller for use in a first node in an 
arbitrary topology collection of nodes for controlling point- 
to-point data packet exchanges between said first node and 
adjacent nodes, said data packets having a length of (W) bits, 
said interconnect controller comprising: 
a plurality of communications ports comprising at least 
first and second communications ports for receiving 
and conveying said data packets between said first node 
and said adjacent nodes, said first and second commu-
nications ports comprising serial ports for conveying
and receiving data packets one bit at a time, said data 
packets comprising (W) bits plus (X) control informa- 
tion bits wherein the packet transmission time for 
launching a packet of W+X bits from one of said serial 
ports to an adjacent interconnect controller requires a 
time (T); 

said data packets including packet age identification bits 
incremented to indicate the occurrence of certain con- 
ditions including delivery failure; 

a plurality of channel modules each coupled to one of said 
plurality of communications ports, respectively, for 
controlling the flow of said data packets into and out of 
said interconnect controller wherein each of said chan- 
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nel modules may be coupled to a channel module of an 
adjacent node through interconnect controllers imple- 
mented in said adjacent nodes, said coupled channel 
modules of two adjacent nodes continuously exchang- 

5 rng a flow of data packets through an isochronous 
communications channel; 
timing control logic means incorporated in each of said 
plurality of channel modules for adjusting the round 
trip delay (Dij) of packets exchanged between coupled 

1° adjacent channel modules to equal an integral number 
of T transmission times where Dij is the round trip time 
for a data packet to travel from a node i to a node j and 
back to node i, said timing control logic means includ- 
ing a shift register through which received data packets 

15 pass, said shift register having a variable depth which 
is set to adjust the round trip delay (Dij) for packets 
exchanged between adjacent channel modules to be an 
integral multiple of packet transmission time (T); 
a clock means and means for synchronizing said clock 

20 means with the clock means incorporated into the 
interconnect controllers of adjacent nodes, said means 
for synchronizing said clock means including: 
a FIFO buffer through which received data packets 
pass; 

25 phase logic means for measuring the depth of data 
maintained in said FIFO buffer wherein said FIFO 
buffer depth is an indication of the phase/frequency 
relationship between adjacent interconnect control- 
lers; 

30 clock speed adjustment means responsive to said phase 
logic means for adjusting the synchronizing the 
clock means of said interconnect controller with the 
clock means of said adjacent interconnect controller;

a common buffer pool coupled to said plurality of channel
modules for buffering incoming and outgoing data 
packets; and 

routing table logic in communication with said common 
buffer pool and said plurality of channel modules for 
routing data packets through appropriate channel mod-
ules.
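
The delay adjustment recited in claim 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the round trip delay and the packet transmission time (T) are both expressed as whole bit times; the names `padding_depth` and `VariableDepthShiftRegister` are hypothetical and do not appear in the specification.

```python
from collections import deque


def padding_depth(raw_round_trip: int, packet_time: int) -> int:
    """Extra delay needed so that the round trip becomes a multiple of T.

    Both arguments are in bit times (an assumption of this sketch).
    """
    if packet_time <= 0:
        raise ValueError("packet_time must be positive")
    if raw_round_trip < 0:
        raise ValueError("raw_round_trip must be non-negative")
    # Smallest non-negative d with (raw_round_trip + d) % packet_time == 0.
    return (-raw_round_trip) % packet_time


class VariableDepthShiftRegister:
    """Delays each input by `depth` shifts (hypothetical software model)."""

    def __init__(self, depth: int, fill: int = 0) -> None:
        self._cells = deque([fill] * depth)

    def shift(self, bit_in: int) -> int:
        if not self._cells:
            # Depth zero: the input passes straight through.
            return bit_in
        self._cells.append(bit_in)
        return self._cells.popleft()


# Example: a 130-bit raw round trip with 40-bit packets needs 30 bits of padding.
depth = padding_depth(130, 40)
register = VariableDepthShiftRegister(depth)
```

With the register depth set this way, the padded round trip of 160 bit times is exactly four packet transmission times.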

3. The interconnect controller of claim 2 wherein said 
clock speed adjustment means comprises: 

means for converting said FIFO buffer depth to an analog
equivalent signal;

low pass filter means for receiving and low pass filtering
said analog equivalent signal; and

a voltage controlled oscillator coupled to said low pass
filter for generating a control signal to adjust the clock
means of said interconnect controller.
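
The chain recited in claim 3 (buffer depth, analog equivalent, low pass filter, voltage controlled oscillator) behaves like a phase locked loop. The discrete-time sketch below is an illustration only. It assumes that the FIFO is written at the adjacent node's rate and read at the local rate, that a first-order filter is used, and that the oscillator output directly sets the local frequency; the function name, gains, and target depth are all hypothetical values chosen for the example.

```python
def simulate_clock_lock(remote_freq: float,
                        nominal_freq: float,
                        steps: int,
                        gain: float = 0.01,
                        alpha: float = 0.1,
                        target_depth: float = 8.0):
    """Return (local_freq, depth) after `steps` iterations of the loop."""
    depth = target_depth      # FIFO starts at its target fill level
    filtered = 0.0            # state of the first-order low pass filter
    local_freq = nominal_freq
    for _ in range(steps):
        # Writes arrive at the remote rate; reads drain at the local rate.
        depth += remote_freq - local_freq
        # "Analog equivalent" of the depth, taken relative to the target.
        error = depth - target_depth
        # Low pass filter: exponential smoothing of the error signal.
        filtered += alpha * (error - filtered)
        # Oscillator model: a rising depth means the local clock is slow,
        # so the local frequency is raised in proportion.
        local_freq = nominal_freq + gain * filtered
    return local_freq, depth


local_freq, depth = simulate_clock_lock(1.001, 1.000, steps=2000)
```

In this model the local frequency settles at the remote frequency, and the buffer depth settles slightly above its target; the residual offset in depth is what sustains the frequency correction.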

4. The interconnect controller of claim 3 wherein said 
timing control logic means and said FIFO buffer comprise a 
common memory means with separate read and write ports.
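
One way to picture a common memory with separate read and write ports is a circular buffer whose read and write positions advance independently, so that their difference gives the buffer depth. The sketch below is an assumption made for illustration; the class name `DualPortBuffer` and its methods are hypothetical and are not taken from the specification.

```python
class DualPortBuffer:
    """Circular memory with independent read and write positions."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._mem = [None] * capacity
        self._write = 0   # total packets written so far
        self._read = 0    # total packets read so far

    def depth(self) -> int:
        """Number of packets currently held (write position minus read)."""
        return self._write - self._read

    def write(self, packet) -> None:
        if self.depth() == len(self._mem):
            raise OverflowError("buffer full")
        self._mem[self._write % len(self._mem)] = packet
        self._write += 1

    def read(self):
        if self.depth() == 0:
            raise IndexError("buffer empty")
        packet = self._mem[self._read % len(self._mem)]
        self._read += 1
        return packet


buf = DualPortBuffer(4)
for packet in ("a", "b", "c"):
    buf.write(packet)
first = buf.read()   # oldest packet leaves first
```

Because the depth is simply the distance between the two positions, the same memory can serve both as the delay element and as the source of the depth measurement.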

5. The interconnect controller of claim 4 further compris-
ing means for remotely setting said clock means.

6. An interconnect controller for use in a first node in an 
arbitrary topology collection of nodes for controlling point- 
to-point data packet exchanges between said first node and 
adjacent nodes, said data packets having a length of (W) bits,
said interconnect controller comprising:

a plurality of communications ports comprising at least 
first and second communications ports for receiving 
and conveying said data packets between said first node 
and said adjacent nodes, said first and second commu-
nications ports comprising serial ports for conveying
and receiving data packets one bit at a time, said data 
packets comprising (W) bits plus (X) control informa-
tion bits wherein the packet transmission time for
launching a packet of W+X bits from one of said serial 
ports to an adjacent interconnect controller requires a 
time (T); 

said data packets including packet age identification bits 
incremented to indicate the occurrence of certain con- 
ditions including delivery failure, said data packets are 
deleted if said packet age identification bits indicate 
packet age to be beyond a predetermined value; 

a plurality of channel modules each coupled to one of said 
plurality of communications ports, respectively, for 
controlling the flow of said data packets into and out of 
said interconnect controller wherein each of said chan- 
nel modules may be coupled to a channel module of an 
adjacent node through interconnect controllers imple- 
mented in said adjacent nodes, said coupled channel 
modules of two adjacent nodes continuously exchang- 
ing a flow of data packets through an isochronous 
communications channel; 

timing control logic means incorporated in each of said 
plurality of channel modules for adjusting the round
trip delay (Dij) of packets exchanged between coupled 
adjacent channel modules to equal an integral number 
of T transmission times where Dij is the round trip time 
for a data packet to travel from a node i to a node j and 
back to node i; 

a common buffer pool coupled to said plurality of channel 
modules for buffering incoming and outgoing data 
packets; and 

routing table logic in communication with said common
buffer pool and said plurality of channel modules for
routing data packets through appropriate channel mod-
ules.

7. A global clocking apparatus including a global clock for
clocking a plurality of nodes, said apparatus comprising:

a FIFO buffer in said first node which receives data
packets from a second node;

a phase logic circuit measuring the depth of data main-
tained in said FIFO buffer and using the depth of said
data in said FIFO buffer to indicate whether said first
node is running faster or slower than said second node;

an adjusting circuit in said first node which uses the output
of said phase logic circuit to synchronize a local clock
in said first node with said global clock.

8. The global clocking apparatus of claim 7 further
comprising:

a circuit which converts the depth of data in said FIFO
buffer to an analog equivalent signal;

a low pass filter filtering the analog equivalent signal; and

a voltage controlled oscillator receiving an output of the
low pass filter and generating a control signal to adjust
the local clock.

9. The global clocking apparatus of claim 7 wherein the
FIFO buffer includes a common memory with separate read
and write ports.

10. The global clocking apparatus of claim 7 wherein the
adjusting circuit includes a timer which places the first node
in an accessible state after a predetermined amount of time,
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